Secondary structure motif determination in ncRNA via graph kernel based computational models
Abstract
Non-coding RNA (ncRNA), a functional molecule that is not translated into protein, has gained considerable importance in bioinformatics, therapeutics, chemoinformatics and for the advancement of science. The function of an ncRNA is determined by its nucleotide composition and its structure, that is, by which nucleotides are paired and which are unpaired. Key analytical tools such as folding, alignment and clustering algorithms rely on energetic considerations to answer the queries they are designed for. In practice, these algorithms become inaccurate when their underlying assumptions, such as energy additivity, are violated by non-linear effects. To overcome this, key parameters can be formulated as nonlinear functional dependencies that are learned from known examples (or parts of examples) or from suboptimal RNA structure predictions. Given the importance of structural elements in ncRNA, these methods should ideally work in structured domains, i.e. they should accept graph data structures as input. The methods belong to the family of kernel machines, since this class of algorithms can use heterogeneous features and accept complex instances, such as sequences of graphs, as input. The aim of the thesis is to develop a computational model capable of identifying subgraphs within the ncRNA folding graph that are characteristic of biological functions, and then to feed these subgraphs to kernel models in order to improve the accuracy of RNA secondary structure prediction.
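As an illustration of what "accepting graph data structures as input" can look like, here is a minimal sketch rather than the thesis's own method. It assumes the secondary structure is given in dot-bracket notation without pseudoknots, and it swaps in a simple Weisfeiler-Lehman-style neighbourhood relabelling as the graph kernel. The function names (`dot_bracket_to_graph`, `wl_features`, `normalized_kernel`) and the label scheme (nucleotide plus `p`/`u` for paired/unpaired) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def dot_bracket_to_graph(seq, structure):
    """Build a labelled graph from a sequence and its dot-bracket structure.

    Nodes are nucleotide positions. Edges are backbone links (i, i+1) and
    base pairs. Assumption: no pseudoknots, only '(', ')' and '.'.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    n = len(seq)
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):  # backbone edges
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    stack = []
    for i, ch in enumerate(structure):  # base-pair edges
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            adj[i].add(j)
            adj[j].add(i)
    if stack:
        raise ValueError("unbalanced structure")
    # Hypothetical label scheme: nucleotide + paired ('p') / unpaired ('u').
    labels = {i: seq[i] + ("p" if structure[i] in "()" else "u") for i in range(n)}
    return labels, adj


def wl_features(labels, adj, rounds=2):
    """Count node labels after each round of neighbourhood relabelling."""
    feats = Counter(labels.values())
    current = dict(labels)
    for _ in range(rounds):
        # New label = own label plus the sorted labels of all neighbours.
        current = {
            v: current[v] + "|" + ",".join(sorted(current[u] for u in adj[v]))
            for v in adj
        }
        feats.update(current.values())
    return feats


def kernel(f1, f2):
    """Dot product of two sparse feature-count vectors."""
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())


def normalized_kernel(f1, f2):
    """Cosine-normalised kernel value in [0, 1]."""
    return kernel(f1, f2) / math.sqrt(kernel(f1, f1) * kernel(f2, f2))


hairpin = wl_features(*dot_bracket_to_graph("GGGAAACCC", "(((...)))"))
unfolded = wl_features(*dot_bracket_to_graph("GGGAAACCC", "........."))
similarity = normalized_kernel(hairpin, unfolded)
```

A matrix of such kernel values between all pairs of structures could then be passed to any kernel machine that accepts a precomputed Gram matrix. Because the features are counts of local neighbourhoods, two structures score highly when they share many small subgraphs, which is the intuition behind looking for subgraphs that are characteristic of a function.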
Related resources
RNAv: Non-coding RNA Secondary Structure Variation Search via Graph Homomorphism
Non-coding RNA (ncRNA) secondary structural homologs can be detected effectively in genomes with profile-based search methods. However, due to the lack of appropriate ncRNA structural evolution models, it is difficult to accurately detect distant structural homologs, i.e., ncRNA structures with variations caused by evolutionary changes such as the insertion or deletion of a substantial portion ...
A Computational Pipeline for High-Throughput Discovery of cis-Regulatory Noncoding RNA in Prokaryotes
Noncoding RNAs (ncRNAs) are important functional RNAs that do not code for proteins. We present a highly efficient computational pipeline for discovering cis-regulatory ncRNA motifs de novo. The pipeline differs from previous methods in that it is structure-oriented, does not require a multiple-sequence alignment as input, and is capable of detecting RNA motifs with low sequence conservation. W...
ncRNA discovery and functional identification via sequence motifs
Non-coding RNAs play regulatory roles in gene expression via establishing stable joint structures with target mRNAs through complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Here we introduce two computational tools that both exploit differential distributions of short sequence motifs in ncRNAs for the purpose of identifying their loci an...
Journal of Integrative Bioinformatics
Non-coding RNAs (ncRNAs) contain both characteristic secondary-structure and short sequence motifs. However, “complex” ncRNAs (RNA bound to proteins in ribonucleoprotein complexes) can be hard to identify in genomic sequence data. Programs able to search for ncRNAs were previously limited to ncRNA molecules that either align very well or have highly conserved secondary-structure. The RNAmotif p...
Grammar string: a novel ncRNA secondary structure representation
Multiple ncRNA alignment has important applications in homologous ncRNA consensus structure derivation, novel ncRNA identification, and known ncRNA classification. As many ncRNAs’ functions are determined by both their sequences and secondary structures, accurate ncRNA alignment algorithms must maximize both sequence and structural similarity simultaneously, incurring high computational cost. F...
Publication date: 2011